Skip to main content

Tables - tabulate

The script below demonstrates different ways to use the tabulate command to create descriptive statistics.

The command is used on categorical variables to produce frequency tables, both unidimensional and multidimensional. By combining with the summarize() option, you can also create volume tables.

Both numeric and alphanumeric variables can be used as input, but the number of cells in the table must not exceed a certain limit. E.g. you can make a table of all municipalities (distributed by gender etc), but not of all education codes.

 require no.ssb.fdb:23 as db

create-dataset demography
import db/INNTEKT_BRUTTOFORM 2020-01-01 as wealth
import db/BEFOLKNING_KJOENN as gender
import db/BEFOLKNING_FOEDSELS_AAR_MND as birthdate
import db/SIVSTANDFDT_SIVSTAND 2016-01-01 as sivstatus16
import db/SIVSTANDFDT_SIVSTAND 2020-01-01 as sivstatus20
import db/BOSATTEFDT_BOSTED 2020-01-01 as municipality
import db/BEFOLKNING_STATUSKODE 2020-01-01 as regstat

// Recode from municipality to county level 
generate county = substr(municipality,1, 2)


// Generate descriptive statistics

// Frequency tables - oneway and twoway
// The best way to become familiar with discrete variables is through frequency tables. These show the number of units for each category, and also provide an overview of which categories are used for the relevant variable. One can look at individual variables through one-way tables, but also combine two or more in one and the same table (cross tables). This gives an insight into how the frequencies are distributed, controlled for values of other variables

define-labels countystring '03' Oslo '11' Rogaland '15' 'Møre og Romsdal' '18' Nordland '30' Viken '34' Innlandet '38' 'Vestfold og Telemark' '42' Agder '46' Vestland '50' Trøndelag '54' 'Troms og Finnmark' '21' Spitsbergen '25' 'Education abroad' '99' Unknown

assign-labels county countystring

tabulate county
tabulate gender
tabulate gender regstat
tabulate county gender

// Crosstable showing only category values without lables
tabulate gender regstat county, nolabels

// Crosstable including missing values 
tabulate county regstat, missing

// Crosstable showing result only for persons over 30 years of age
generate age = 2020 - int(birthdate/100)
tabulate county regstat if age > 30

// Percentage tables
tabulate sivstatus16 sivstatus20, rowpct
tabulate sivstatus16 sivstatus20, colpct
tabulate sivstatus16 sivstatus20, cellpct
tabulate sivstatus16 sivstatus20, rowpct freq

// The tabulate command can also be used to create volume tables when combining with summarize. This will show statistics like mean values etc for a variable divided by categories specified through the chosen tabulate variable(s)

tabulate county gender, summarize(wealth)